Efficient Pipelining of Nested Loops: Unroll-and-Squash
Abstract
The size and complexity of current custom VLSI designs have forced the use of high-level programming languages to describe hardware, and of compiler and synthesis technology to map abstract designs into silicon. Many applications that operate on large streaming data require custom VLSI because of high-performance or low-power constraints. Since the data processing is typically described by loop constructs in a high-level language, loops are the most critical portions of the hardware description, and special techniques are developed to synthesize them optimally. In this thesis, we introduce a new method for mapping nested loops into hardware and pipelining them efficiently. The technique achieves fine-grain parallelism even on inner loops with strong intra- and inter-iteration data dependences and, by economically sharing resources, improves performance at the expense of a small amount of additional area. We implemented the transformation within the Nimble Compiler environment and evaluated its performance on several signal-processing benchmarks. The method achieves up to a 2x increase in area efficiency compared to the best known optimization techniques.
Thesis Supervisors:
Randolph E. Harr, Director of Research, Advanced Technology Group, Synopsys, Inc.
Saman P. Amarasinghe, Assistant Professor, MIT Laboratory for Computer Science
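The kind of loop nest the abstract targets, an outer loop over independent data items wrapping an inner loop that carries a dependence from one iteration to the next, can be illustrated with a small sketch. The code below is an assumption-laden illustration, not the thesis's method: `round_step` is a hypothetical loop body, and the transformation shown is plain unroll-and-jam (unroll the outer loop by two and fuse the inner loops), which is the baseline family of transformations rather than unroll-and-squash itself. The abstract does not give enough detail to reproduce the latter.

```python
MASK = 0xFFFFFFFF  # keep values within 32 bits, as a fixed-width datapath would


def round_step(x, j):
    """Hypothetical inner-loop body: round j needs the result of round j-1."""
    return ((x * 5 + j) ^ (x >> 3)) & MASK


def nested_original(data, rounds):
    """Reference loop nest: independent outer iterations, dependent inner ones."""
    out = []
    for x in data:                   # outer loop: items do not depend on each other
        for j in range(rounds):      # inner loop: x is carried across iterations
            x = round_step(x, j)
        out.append(x)
    return out


def nested_unroll_and_jam(data, rounds):
    """Outer loop unrolled by 2, inner loops fused (unroll-and-jam)."""
    out = []
    n = len(data) - len(data) % 2    # largest even prefix
    for i in range(0, n, 2):
        a, b = data[i], data[i + 1]
        for j in range(rounds):
            # The two updates are independent of each other, so hardware
            # could evaluate them in parallel with two copies of the body.
            a = round_step(a, j)
            b = round_step(b, j)
        out.extend((a, b))
    for x in data[n:]:               # leftover item when len(data) is odd
        for j in range(rounds):
            x = round_step(x, j)
        out.append(x)
    return out
```

In this sketch the jammed version exposes two independent operations per inner iteration, but a direct hardware mapping would duplicate the loop body to exploit them. The abstract's claim is that the thesis's technique obtains comparable parallelism while sharing resources, at a small cost in additional area.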
Similar Resources
Improving Software Pipelining with Unroll-and-Jam
To take advantage of recent architectural improvements in microprocessors, advanced compiler optimizations such as software pipelining have been developed [1, 2, 3, 4]. Unfortunately, not all loops have enough parallelism in the innermost loop body to take advantage of all of the resources a machine provides. Unroll-and-jam is a transformation that can be used to increase the amount of paralleli...
Software pipelining of nested loops for real-time DSP applications
Modern DSP processors have been integrated with Instruction-Level Parallelism (ILP), which presents a challenge to exploit ILP within DSP applications. Software pipelining is an efficient technique used to expose ILP for loop programs and has been widely used for current microprocessors. It has been recently used in DSP compilers, but only for the innermost loops. This paper proposes a new approac...
Optimal Software Pipelining of Nested Loops
This paper presents an approach to software pipelining of nested loops. While several papers have addressed software pipelining of single (non-nested) loops, little work has been done in the area of applying it to nested loops. This paper solves the problem of finding the minimum iteration initiation interval (in the absence of resource constraints) for each level of a nested loop. The problem is...
Software pipelining of nested loops
This paper presents an approach to software pipelining of nested loops. While several papers have addressed software pipelining of inner loops, little work has been done in the area of extending it to nested loops. This paper solves the problem of finding the minimum iteration initiation interval (in the absence of resource constraints) for each level of a nested loop. The problem is formulated a...
Program Parallelization Using Synchronized Pipelining
While there are well-understood methods for detecting loops whose iterations are independent and parallelizing them, there are comparatively fewer proposals that support parallel execution of a sequence of loops or nested loops in the case where such loops have dependencies among them. This paper introduces a refined notion of independence, called eventual independence, that in its simplest for...